In linear-real coding, the transmitted signals are (possibly redundant)
linear combinations of the data signals. The linear combination of data
signals can have a block pattern, resulting in linear-real block coders, or
a stationary pattern, resulting in linear-real stationary (shift-register)
coders. Stationary coding is shown to be a limiting case of block coding.
Both methods appear to be practical for the control of burst and impulse
noise. However, stationary coding appears to have some advantages and
is the only one we study here. We propose shift register implementations
which promise the required precision and dispersion at less cost than
tuned RLC circuits.

Error properties of both block and stationary coders are similar, but 
it is easier to learn concepts by analyzing the block coders. When the receiver 
is able, by using some of the techniques we discuss, to estimate the noise 
covariance matrix for each codeblock, the resulting noise power is less than 
that for receivers not using the statistics for each codeblock. 

Nonlinear memoryless filters, such as clippers, are especially effective 
when used with linear-real coders. We propose a memoryless filter which 
attenuates the input signal more severely when a second input to the filter 
indicates the channel is having a noise burst. If the memoryless filter is
designed for the worst case noise, then performance will not degrade with
decreased noise when the nonlinearity is odd and monotonic.

I. INTRODUCTION 

Many communications channels, including telephone channels, con- 
tain noise which comes in short bursts, such as noise from impulses. 
Such noise is particularly deleterious when the channel is used for 
the transmission of digital data. 



* Part of the research for this article was performed at Carnegie Institute of 
Technology under National Science Foundation grants GP-39 and GK-373. 
Some of the material contained in this paper is taken from the author's con- 
vention article. 
At least as early as 1958 it was discovered that it is sometimes
possible to reduce digital errors in such channels without reducing the
noise power by using a scheme such as Fig. 1 shows. In some formulations
(Refs. 1-7) the transformation A consisted of a continuous all-pass filter
whose Fourier transform magnitude was unity at all frequencies but
whose phase characteristic varied with frequency; the inverse linear
transformation was the continuous all-pass filter with the conjugate
phase characteristic. The linear filter was called the smear operation,
and its inverse the desmear operation. Later papers (Refs. 8-10) considered
linear transformations to be real-number matrices operating upon the data
in blocks.

In all schemes to which Fig. 1 applies, a single impulse of noise 
into the inverse linear filter will be transformed into an output noise 
which is dispersed in time. With proper design, this dispersed noise will 
be small enough at all times to not produce errors at the output of 
the quantizer. 
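
The dispersal of a single impulse can be illustrated with a small numerical sketch. The choices below are assumptions made only for illustration and are not taken from the paper: a 16 x 16 normalized Hadamard matrix stands in for the transformation A, the data are binary (+1 or -1), and the quantizer is a simple sign decision.

```python
import math

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def matvec(m, v):
    """Multiply matrix m by column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

n = 16
# Normalized Hadamard matrix: orthogonal, so its inverse is its transpose.
A = [[x / math.sqrt(n) for x in row] for row in hadamard(n)]
A_inv = [list(col) for col in zip(*A)]

data = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1]
sent = matvec(A, data)

# One noise impulse, three times the data amplitude, opposing data[5].
noise = [0.0] * n
noise[5] = -3.0
received = [s + c for s, c in zip(sent, noise)]

decoded = matvec(A_inv, received)
decoded_noise = [d - r for d, r in zip(decoded, data)]
quantized = [1 if d >= 0 else -1 for d in decoded]

# Without coding, the same impulse added directly to the data causes an error.
uncoded = [1 if r + c >= 0 else -1 for r, c in zip(data, noise)]

print("peak decoded noise:", max(abs(x) for x in decoded_noise))
print("coded errors:", sum(q != r for q, r in zip(quantized, data)))
print("uncoded errors:", sum(q != r for q, r in zip(uncoded, data)))
```

In this sketch the impulse of amplitude 3 is spread over all 16 decoded components with magnitude 3/4 each, which is below the decision distance of 1, so the sign decisions are unaffected; the noise power itself is unchanged because the matrix is orthogonal.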

Our purpose is to investigate coding schemes which fall in the 
general pattern of Figure 1 to gain conceptual insight and learn practi- 
cal design. Such study is useful because the practicality of the matrix 
version has never been studied, and the continuous all-pass filter was 
limited by cost and filter imprecision. The shift registers we
propose avoid the problems which hindered the application of continuous
all-pass filters.

We show that the real-number linearity of the transformations of 
Fig. 1 will permit the receiver to use any available information about 
noise correlation or position. All of the proposed means for using this 
information are simple in concept, and some are simple to implement. 



II. DESCRIPTION 

Linear-real block coding is a form of coding in which A, an n by k
matrix of real numbers, is used to produce an output vector b from an
input vector r according to the equation

    b = Ar.    (1)
Fig. 1. A general arrangement for placing linear filters A and A^{-1} to reduce
digital errors.



LINEAR-REAL CODERS 1067 

If n > k, then b will be redundant in the sense that not all of its com- 
ponents are independent. The word "real" is used in order to emphasize 
the fact that the arithmetic in equation 1, and all other equations in 
this paper, is real number arithmetic. The use of real number arithmetic 
distinguishes this work from generalized parity-check coders which are 
linear in finite-field arithmetic. 
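
A minimal sketch of equation (1) with redundancy (n > k) follows. The particular matrix (1/sqrt(k) times the first two columns of a 4 x 4 Hadamard matrix), the function names, and the data values are illustrative assumptions; the decoder shown is the generalized inverse, which for this A reduces to (k/n) A'.

```python
import math

k, n = 2, 4
s = 1 / math.sqrt(k)
# n x k real matrix: 1/sqrt(k) times the first k columns of a 4 x 4 Hadamard matrix.
A = [[s,  s],
     [s, -s],
     [s,  s],
     [s, -s]]

def encode(A, r):
    """b = A r in ordinary real-number arithmetic."""
    return [sum(a * x for a, x in zip(row, r)) for row in A]

def decode_unadaptive(A, f):
    """Generalized inverse of A applied to f.

    The columns of this A are orthogonal with squared length n/k,
    so (A'A)^{-1} A' reduces to (k/n) A'.
    """
    n, k = len(A), len(A[0])
    return [(k / n) * sum(A[i][j] * f[i] for i in range(n)) for j in range(k)]

r = [0.3, -1.7]          # real-valued data, not finite-field symbols
b = encode(A, r)         # four transmitted numbers carrying two data numbers
r_hat = decode_unadaptive(A, b)

print("transmitted:", b)
print("recovered:  ", r_hat)
```

With this A the first and third transmitted components are equal, as are the second and fourth, which is the redundancy referred to above.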

Stationary (shift register) linear-real coding is a limiting case of 
linear-real block coding, but is best described as being the convolution 
summation given by 

    b_i = \sum_{j=-\infty}^{\infty} h_{i-j} r_j    (2)

where b_i is the i-th signal transmitted, r_i is the i-th data number, and
where h_q can naturally be called the unit pulse response of the
encoding filter at time-step q.

The conclusions to be reached on practical applications are that 
moderate cost encoders and decoders of considerable use for burst and 
impulse noise channels can be built as soon as low-cost tapped digital 
delay lines are available. Magnetic domain-wall digital delay lines, 11 
for example, might well make these coders practical. 

There are two general ways in which noise is controlled by means 
of linear-real coding. We give the complete details and mathematics 
later. Briefly, the qualitative aspects are: 

The total noise power in the decoded signal is made less than that
without coding. We discuss three distinct ways of doing this:

(i) When linear-real block coding is used, and when the noise 
covariance matrix is known (or can be adaptively deduced by the 
receiver) then this knowledge can be used to reduce the noise power. 
It can be correlation type knowledge, as accounts for the effectiveness 
of Wiener filtering. If the noise process is a posteriori nonstationary, 
then a receiver which estimates the noise correlation matrix for each 
code block may effectively use the available information on the po- 
sition of burst noises within the block. This is particularly effective in 
burst noise channels having block coders using rectangular A matrices. 

(ii) A stationary memoryless nonlinear filter (such as a clipper) can 
be used to reduce the noise power before the inverse linear transforma- 
tion is applied. Such a filter would of course reduce noise power in 
the absence of an inverse filter when it immediately precedes the 
quantizer, but it would not then reduce errors. When placed before 
the inverse transformation, the stationary memoryless nonlinear filter 
reduces both errors and noise power. We refer to equations for analyzing
design and performance of the memoryless nonlinear filter. A simulation
example in Section VI shows these devices to be surprisingly
effective.

(iii) A memoryless nonlinear filter can be used which has both the
noisy signal and an estimate of the instantaneous noise power for
inputs. The output is an optimized estimate of the signal given the
estimated instantaneous noise power. This filter always reduces noise
power, as does the filter in method ii, and only reduces errors if
there is a filter such as the inverse linear transformation between it
and the quantizer. We describe several methods for estimating the
instantaneous noise power in Section V. One of these, which appears in
Fig. 6, uses the fact that practical PAM signals have more bandwidth
than the Nyquist bandwidth for their pulse interval.

The remaining noise power is distributed more evenly among all 
decoded signal components and (in the limit of infinite smearing) 
made Gaussian. This type of noise control is especially effective in 
quantized-signal burst and impulse noise channels which have a 
thermal noise which is small compared with the separation between 
quantization levels. In this case a burst noise with power which is 
small compared with the thermal noise would be unable to produce 
many errors if it were evenly dispersed, although it could when 
bunched up. Dispersal of the burst noise power is sometimes un- 
favorable, but if the noise power is reduced enough and the noise 
dispersed enough, then the effect is very favorable. The decoding op- 
eration also tends to make the decoded signal have a Gaussian first- 
order probability distribution, which reduces the probability of a 
large peak and thereby reduces errors for quantized signals. 

The design equations for the nonlinear memoryless filter (clipper) 
to which we refer assume a known probability distribution on the 
noise, as does the simulation reported. In practice, the actual noise 
can be less noisy than that used for design purposes, and the resulting 
mean square error will not be larger than that with the design noise, 
provided the noise probability density is even and the nonlinearity 
has certain properties. We give precise details in Appendix D. 

III. BLOCK CODES AND THEIR NOISE COVARIANCE MATRIX 

In general, assuming r and c are independent zero-mean column
vector random variables, which represent the signal to be encoded and
the channel noise, respectively, and assuming r and c have nonsingular
covariance matrices Q and N, respectively, and assuming that f =
b + c is decoded by some linear operator T, where b is given in equation
1, then a straightforward evaluation of the covariance matrix of u =
r - Tf will show that

    M = E[uu'] = (I_{k \times k} - TA) Q (I_{k \times k} - TA)' + TNT'    (3)

where ( )' denotes the transpose of a matrix or column vector. This
formula can be used to compare the performance of encoder-decoder
pairs with good and bad choices for matrix A, and good and bad
choices of matrix T.

Table I shows three possible T matrices. The first was shown to be
the least mean square linear estimator in Ref. 9, and for Gaussian
signal and noise gives the conditional mean of the transmitted vector
given the received vector. The second is the first evaluated for infinite
signal power in all degrees of freedom (which implies Q^{-1} = 0)
and produces a decoded error uncorrelated with the signal. The third
does not require the use of the N matrix. All assume the columns of
A to be linearly independent.

Table II gives further insights into the behavior of the decoded error
by presenting a number of special cases of equation (3). The justification
of the equations of Table II is given in Appendix A. In one of
the special cases in Table II, namely when equation (7) applies, the
decoded noise energy is proportional to the arithmetic mean of the
received noise energy. In other cases, such as that of equation (12),
the eigenvalues of A'N^{-1}A play a crucial role in formulas for the mean
square decoded noise.
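
The arithmetic-mean behavior of equation (7) can be checked numerically from equation (3). The sketch below assumes a small example that is not in the paper: n = 4, k = 2, A equal to 1/sqrt(k) times the first two columns of a 4 x 4 Hadamard matrix, a diagonal N with one burst-affected component, and the unadaptive decoder, which for this A is (k/n) A'. The helper names are hypothetical.

```python
import math

def matmul(X, Y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def error_covariance(A, T, Q, N):
    """Equation (3): M = (I - TA) Q (I - TA)' + T N T'."""
    k = len(Q)
    TA = matmul(T, A)
    D = [[(1.0 if i == j else 0.0) - TA[i][j] for j in range(k)] for i in range(k)]
    signal_part = matmul(matmul(D, Q), transpose(D))
    noise_part = matmul(matmul(T, N), transpose(T))
    return [[signal_part[i][j] + noise_part[i][j] for j in range(k)] for i in range(k)]

k, n = 2, 4
s = 1 / math.sqrt(k)
A = [[s, s], [s, -s], [s, s], [s, -s]]

# Unadaptive decoder (A'A)^{-1} A', which equals (k/n) A' for this A.
T = [[(k / n) * a for a in row] for row in transpose(A)]

noise_vars = [0.01, 0.01, 4.0, 0.01]   # a burst hits the third component
N = [[noise_vars[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
Q = [[1.0, 0.0], [0.0, 1.0]]

M = error_covariance(A, T, Q, N)
ms_error = sum(M[i][i] for i in range(k)) / k
arithmetic_mean = sum(noise_vars) / n

print("m.s. error from equation (3):", ms_error)
print("(k/n) * AM from equation (7):", (k / n) * arithmetic_mean)
```

Because TA = I for this decoder, the signal-dependent term of equation (3) vanishes and only T N T' remains, so the burst variance is averaged over the block rather than landing on one decoded component.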

Equation (13) of Table II shows that the average of the eigenvalues
of A'N^{-1}A appears in a formula for a lower bound for the mean



Table I. Three Different Linear Operators for Decoding f Into r

Name                                                  Formula

Mean estimator                                        T = (Q^{-1} + A'N^{-1}A)^{-1} A'N^{-1}
(Gives least mean square error)

Unattenuated estimator                                T = (A'N^{-1}A)^{-1} A'N^{-1}

Unadaptive estimator                                  T = (A'A)^{-1} A'
(The generalized inverse of A)
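
The three decoders of Table I can be compared on a small example by inserting each T into equation (3). Everything specific below is an assumption for illustration: k = 2 (so that a closed-form 2 x 2 inverse suffices), an arbitrary 4 x 2 matrix A with independent columns, a diagonal N with one noisy component, and Q = 2I. The helper names are hypothetical.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inv2(X):
    """Closed-form inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def ms_error(A, T, Q, N):
    """Mean square error (1/k) tr M, with M taken from equation (3)."""
    k = len(Q)
    TA = matmul(T, A)
    D = [[(1.0 if i == j else 0.0) - TA[i][j] for j in range(k)] for i in range(k)]
    M = add(matmul(matmul(D, Q), transpose(D)), matmul(matmul(T, N), transpose(T)))
    return sum(M[i][i] for i in range(k)) / k

k, n = 2, 4
A = [[1.0, 0.0], [0.5, 1.0], [0.0, 0.5], [1.0, -1.0]]   # arbitrary, independent columns
noise_vars = [0.1, 0.1, 5.0, 0.2]
N = [[noise_vars[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
N_inv = [[1.0 / noise_vars[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
Q = [[2.0, 0.0], [0.0, 2.0]]

At = transpose(A)
AtNinv = matmul(At, N_inv)
AtNinvA = matmul(AtNinv, A)

T_mean = matmul(inv2(add(inv2(Q), AtNinvA)), AtNinv)
T_unattenuated = matmul(inv2(AtNinvA), AtNinv)
T_unadaptive = matmul(inv2(matmul(At, A)), At)

errors = {
    "mean": ms_error(A, T_mean, Q, N),
    "unattenuated": ms_error(A, T_unattenuated, Q, N),
    "unadaptive": ms_error(A, T_unadaptive, Q, N),
}
for name, value in errors.items():
    print(name, value)
```

In this example the mean estimator gives the smallest error and the unadaptive estimator the largest, as expected, since the first is the least mean square linear estimator and the second is the best of the decoders satisfying TA = I.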
Table II. Some Special Cases of the Error Covariance
Matrix of Equation (3) and the Resulting Mean Square Error

Unadaptive Estimator

    M = (A'A)^{-1} A'NA [(A'A)^{-1}]'.    (4)

    m.s. error = (1/k) tr (A'A)^{-1} A'NA [(A'A)^{-1}]'.    (5)

When the columns of A are orthogonal and each of length (n/k)^{1/2}:

    M = (k/n)^2 A'NA.    (6)

When in addition N = diag (n_1, n_2, ..., n_n), and AM is the arithmetic mean of
these n_i's, and A is 1/(k)^{1/2} times the first k columns of a Hadamard matrix (see
Appendix A for a definition):

    m.s. error = M_{ii} = (k/n) AM.    (7)

Unattenuated Estimator

    M = (A'N^{-1}A)^{-1}.    (8)

    m.s. error = (1/k) tr (A'N^{-1}A)^{-1}.    (9)

Mean Estimator

    M = (Q^{-1} + A'N^{-1}A)^{-1}.    (10)

    m.s. error = (1/k) tr (Q^{-1} + A'N^{-1}A)^{-1}.    (11)

Mean Estimator (\Omega = 1) or Unattenuated Estimator (\Omega = 0)

    m.s. error = (1/k) \sum_{i=1}^{k} \frac{1}{\lambda_i(\Omega Q^{-1} + A'N^{-1}A)}    (12)

where \lambda_i(Z) denotes the i-th unordered eigenvalue of Z. Special case of above when
Q = sI, s scalar:

    m.s. error = (1/k) \sum_{i=1}^{k} \frac{1}{\Omega s^{-1} + \lambda_i(A'N^{-1}A)}
               \ge \frac{1}{\Omega s^{-1} + (1/k) \sum_{i=1}^{k} \lambda_i(A'N^{-1}A)}    (13)

Special case of equation (12) when Q = sI, and A is square, orthogonal, and each
column has length (n/k)^{1/2}:

    m.s. error = (1/k) \sum_{i=1}^{k} \frac{1}{\Omega s^{-1} + \lambda_i(N^{-1})}    (14)

The following assumptions are referred to as equation (15):

    \Omega = 1: T is the mean estimator.
    \Omega = 0: T is the unattenuated estimator, and A'N^{-1}A is positive definite.
    Q = sI, s scalar.
    A is 1/(k)^{1/2} times the first k columns of an n x n Hadamard matrix.
    N = diag (n_1, n_2, ..., n_n).
    The n_i variables are independent, identically distributed random variables such
    that E(1/n_i) exists, 1/n_i has finite variance \sigma^2, and the harmonic mean
    of the n_i variables

        HM = \left[ (1/n) \sum_{i=1}^{n} 1/n_i \right]^{-1}    (15)

    is finite.
    k is large enough for the weak law of large numbers to apply.

Assuming equation (15):

    \frac{1}{\Omega s^{-1} + n/(k HM)} \le m.s. error    (16)

    m.s. error \le [upper bound: a sum of two terms, each with numerator 0.5 and a
    denominator formed from \Omega s^{-1}, kHM, and t; the exact expression is
    illegible in the source]    (17)

provided that the first denominator is positive, where t is given by equation 35
of Appendix I.



square error; furthermore, that the mean square error equals this
lower bound only when all the eigenvalues are the same. Thus the
deviations of the eigenvalues of A'N^{-1}A determine the closeness of the
lower bound of equation (13), which Appendix A shows is sometimes
related to the harmonic mean of the eigenvalues of N, which appears
in equations (16) and (17).
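
The role of the eigenvalue spread in the bound of equation (13) can be seen in a short sketch. It takes the eigenvalues of A'N^{-1}A as given numbers (hypothetical values chosen for illustration, both sets having mean 2.0) and evaluates the two sides of the inequality; the function names are assumptions.

```python
def ms_error_from_eigenvalues(eigs, s, omega):
    """Left side of equation (13): (1/k) * sum of 1 / (omega/s + lambda_i)."""
    return sum(1.0 / (omega / s + lam) for lam in eigs) / len(eigs)

def lower_bound(eigs, s, omega):
    """Right side of equation (13): 1 / (omega/s + mean eigenvalue)."""
    mean_eig = sum(eigs) / len(eigs)
    return 1.0 / (omega / s + mean_eig)

s, omega = 10.0, 1            # Q = sI and the mean estimator
spread = [0.2, 1.0, 4.8]      # hypothetical eigenvalues, widely spread
equal = [2.0, 2.0, 2.0]       # same mean, no spread

for name, eigs in (("spread", spread), ("equal", equal)):
    print(name, ms_error_from_eigenvalues(eigs, s, omega), lower_bound(eigs, s, omega))
```

With equal eigenvalues the two sides coincide; with the spread set, the mean square error is roughly three times the bound, which is the sense in which the deviations of the eigenvalues govern how tight the bound is.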

A geometric illustration of the eigenvalues of A'N^{-1}A for rectangular
A with orthonormal columns begins with the observation that the
eigenvectors of N^{-1} form the semiaxes of an n-dimensional ellipsoid.
The projection of this ellipsoid by the transformation A' forms another
ellipsoid, which will be called the k-dimensional shadow of the
original n-dimensional ellipsoid.*

The semiaxes of the shadow ellipsoid have the lengths of the eigenvalues
of A'N^{-1}A. In order for the equation (13) bound to be close
to the actual value, the semiaxes of the shadow ellipsoid have to be
generally near their mean length; in other words, the shadow has to
be round. A sufficient condition for the shadow to be round is that
the ellipsoid is the shadow of a round ellipsoid, but this is not necessary.
For some of the possible spatial orientations, for example, a
football's shadow is rounder than the football.

IV. THE LIMITING CASE OF STATIONARY (SHIFT REGISTER) CODING 

The purpose of this section is to show that — in the limit — all linear- 
real coding and decoding operations can become time stationary, so 
that they can be implemented by shift registers with time-invariant 
impulse responses. The limit is taken in the sense that the transmitted 
digits are obtained as a single block code whose output is a column 



* An ordinary planar shadow of a three-dimensional object will be an orthogo- 
nal projection only when the light rays are parallel, and are normal to the 
plane of the shadow. 
vector with components from -n to n, where n approaches infinity.
There are two reasons why a study taking linear-real coding to the
limit of being time stationary can be advantageous or useful:

(i) Stationary encoders and decoders appear to be more economical
to implement than the block type of encoders and decoders.

(ii) The mathematical investigations to be made in the passage to
the limit will add insights to linear-real coding by showing that a
special case of it is Wiener filtering, and will add insights to Wiener
filtering by showing that a Wiener filter is related to the least mean
square estimator of matrix-encoded noise data vectors.
Toeplitz matrices, defined later, and Z-transforms (Ragazzini and
Franklin, Ref. 12) are our main mathematical techniques to reach these
ends.

4.1 Stationary Coders 

The transmitted signal b_i is assumed to be obtained from the data
stream r_j by the convolution summation of equation (2), which can
be put in matrix form by means of the doubly infinite vectors

    b = [..., b_{-1}, b_0, b_1, ...]',    r = [..., r_{-1}, r_0, r_1, ...]',    etc.,

and the Toeplitz matrix (defined in section 4.2)

    A_{ij} = a_{i-j} = h_{i-j}

so that equation (2) can be expressed in matrix form by

    b = Ar.

The problem of how to perform the infinite matrix multiplications, 
either analytically or with hardware, will be shown to be solvable by 
the use of Z-transforms. 
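
For a finite stretch of data the equivalence between the convolution summation and multiplication by a Toeplitz matrix can be checked directly. The sketch below is only a truncated illustration of the doubly infinite case; the unit pulse response h, the data values, and the function names are hypothetical.

```python
h = [1.0, 0.5, 0.25]               # hypothetical causal unit pulse response h_0, h_1, h_2
r = [1.0, -1.0, 1.0, 1.0, -1.0]    # a finite stretch of data

def convolve(h, r):
    """Convolution summation: b_i = sum over j of h_{i-j} r_j."""
    out = [0.0] * (len(h) + len(r) - 1)
    for i, hi in enumerate(h):
        for j, rj in enumerate(r):
            out[i + j] += hi * rj
    return out

def toeplitz(h, rows, cols):
    """Matrix with elements A_ij = h_{i-j}; h_q is zero outside 0 <= q < len(h)."""
    return [[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(cols)]
            for i in range(rows)]

A = toeplitz(h, len(h) + len(r) - 1, len(r))
b_matrix = [sum(a * x for a, x in zip(row, r)) for row in A]
b_conv = convolve(h, r)

print("b = A r    :", b_matrix)
print("convolution:", b_conv)
```

Each column of A is the same pulse response shifted down by one step, which is the time-invariance that lets a shift register replace the matrix multiplication.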

4.2 Infinite Toeplitz Matrices 

An infinite matrix A, with elements A_{ij}, i, j = 0, ±1, ±2, ...,
will be called Toeplitz* if some sequence ..., a_{-1}, a_0, a_1, ... exists
such that

    A_{ij} = a_{i-j}    (18)

for all i, j. Associated with this Toeplitz matrix will be the two-sided
Z-transform

    a(z) = \sum_{q=-\infty}^{\infty} a_q z^{-q}.    (19)

* Hermitian matrices of the type of Equation (14) are called Toeplitz forms,
and are described by Grenander and Szego (Ref. 18). The Hermitian property is not
assumed in this paper's definition, since it is not needed for some of the results.



The convergence properties of Toeplitz matrices could prove troublesome
in some cases, but in this paper most difficulties will be avoided by
using only those matrices whose associated Z-transform, according to
equations (18) and (19), has all its poles some finite distance from the
circle |z| = 1, and which is absolutely convergent on |z| = 1. (If the
matrix is to be inverted, it also must have its zeros some finite distance
from |z| = 1.)

Any poles outside |z| = 1 arise from a_q sequences which are nonzero
for q < 0. This should not cause alarm, as noncausality of unit pulse
responses for decoders is not a serious practical obstacle, since actual
noncausal unit pulse responses can be arbitrarily well approximated
by accepting a decoding delay. These restrictions on the poles of the
associated Z-transforms require that a_q be bounded by a geometrically
decreasing sequence as q -> ±∞.

Section B.1 of Appendix B presents theorems which are useful in
relating Toeplitz matrix operations to Z-transforms, and shows how
least mean square matrix operators of the Toeplitz type can be related
to Wiener-filter types of sampled data estimators.

4.3 Error Analysis 

When A, Q, and N are Toeplitz and nonsingular, the expressions for
the mean square error equivalent to the equations of Table II are

    T_MEAN = (Q^{-1} + A'N^{-1}A)^{-1} A'N^{-1}

or

    t_mean(z) = \frac{q(z) a(1/z)}{n(z) + a(1/z) q(z) a(z)}

gives

    m.s. error = [on-diagonal component of M_MEAN = (Q^{-1} + A'N^{-1}A)^{-1}]    (20)

or

    m.s. error = Z^{-1} \left[ \frac{q(z) n(z)}{n(z) + a(1/z) q(z) a(z)} \right]    (21)

where Z^{-1} is the inverse Z-transform integral operator. When A is either
finite or Toeplitz but nonsingular, T_UNATTENUATED and T_UNADAPTIVE
give the same decoding matrix, namely A^{-1}, which will be called
T_INVERSE:

    T_INVERSE = A^{-1}

or for the Toeplitz case

    t_inverse(z) = \frac{1}{a(z)}

gives

    m.s. error = [on-diagonal component of A^{-1} N (A^{-1})']    (22)

    m.s. error = Z^{-1} \left[ \frac{n(z)}{a(z) a(1/z)} \right]    (23)



The above error can be evaluated by these three methods:

(i) Truncate A and N and then compute an on-diagonal component
of (A'N^{-1}A)^{-1} near the center of the matrix.

(ii) Use Z-transforms to find the transform corresponding to
(A'N^{-1}A)^{-1} for the unattenuated estimator. Invert the Z-transform
by either

(a) Using the inversion integral for Z-transforms, or
(b) Using pole-zero expansions and a small table of Z-transforms.

Method (ii-a) is the Z-transform analog of using Parseval's theorem to
find mean square errors of stationary nonsampled systems.
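
Methods (i) and (ii-a) can be compared on a simple case. The sketch assumes a hypothetical encoder a(z) = 1 + 0.5 z^{-1}, white noise of unit power, and a truncated lower-triangular Toeplitz A whose inverse is obtained by forward recursion; the inversion integral is approximated by averaging over equally spaced points on |z| = 1. None of these specific choices come from the paper.

```python
import cmath
import math

a = [1.0, 0.5]        # hypothetical encoder pulse response: a(z) = 1 + 0.5 z^-1
noise_power = 1.0     # white noise, n(z) = 1

def inverse_pulse_response(a, length):
    """First column b of the inverse of the truncated lower-triangular Toeplitz A,
    found from sum_j a_j b_{q-j} = 1 for q = 0 and 0 otherwise."""
    b = []
    for q in range(length):
        acc = 1.0 if q == 0 else 0.0
        acc -= sum(a[j] * b[q - j] for j in range(1, min(q, len(a) - 1) + 1))
        b.append(acc / a[0])
    return b

def ms_error_truncated(a, noise_power, size):
    """Method (i): on-diagonal component of A^{-1} N (A^{-1})' at the central row."""
    b = inverse_pulse_response(a, size)
    centre = size // 2
    return noise_power * sum(x * x for x in b[:centre + 1])

def ms_error_unit_circle(a, noise_power, points=256):
    """Method (ii-a): average of n(z) / (a(z) a(1/z)) over the circle |z| = 1."""
    total = 0.0
    for m in range(points):
        z = cmath.exp(2j * math.pi * m / points)
        az = sum(c * z ** (-q) for q, c in enumerate(a))
        total += noise_power / abs(az) ** 2   # a(1/z) is the conjugate of a(z) here
    return total / points

print("truncated matrix:", ms_error_truncated(a, noise_power, 40))
print("unit circle     :", ms_error_unit_circle(a, noise_power))
```

Both evaluations give approximately 4/3 for this example, the sum of the squared terms of the inverse pulse response 1, -0.5, 0.25, ....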

Lemma 1: When A is Toeplitz with columns orthogonal and of length 1,
then

(a) A'A = I
(b) a(1/z) a(z) = 1.

The proof is trivial. Also notice that (a) <=> (b).
Corollary 1: When T = A^{-1} and A is orthogonal,

    m.s. error = m.s. noise.

For design purposes it is desirable to make the following definitions;
both assume T = A^{-1}, which is assumed to exist.



For Toeplitz A and N:

    noise power amplification =
        \frac{[on-diagonal component of A^{-1} N (A^{-1})'] [on-diagonal component of A'A]}
             {[on-diagonal component of N]}    (24)

For Block Coders:

    noise power amplification =
        \frac{[(1/k) tr A^{-1} N (A^{-1})'] [(1/k) tr A'A]}{(1/k) tr N}    (25)



Physically, this corresponds to the actual amplification of noise in a
channel which encodes with a matrix proportional to the A matrix,
where the proportionality constant is selected to make the encoder
give unity power amplification to a white signal, and where the decoder
is T_INVERSE. For the stationary coder and channel, the Z-transform
version is:

    noise power amplification =
        \frac{Z^{-1} \left[ \frac{n(z)}{a(z) a(1/z)} \right] Z^{-1} \left[ a(z) a(1/z) \right]}
             {Z^{-1} \left[ n(z) \right]}    (26)



The block code version of the trace formula can also be used to show
that if the impulse response of the stationary encoder is ..., a_{-1}, a_0, a_1,
..., and its inverse is ..., b_{-1}, b_0, b_1, ..., so that a_q * b_q = \delta_q, then
for N proportional to I_\infty the noise power amplification can be evaluated
from the impulse responses by:

    noise power amplification (for white noise) =
        \left[ \sum_q a_q^2 \right] \left[ \sum_q b_q^2 \right]    (27)

The Z-transform version for N proportional to I_\infty is:

    noise power amplification (for white noise) =
        Z^{-1} \left[ \frac{1}{a(z) a(1/z)} \right] Z^{-1} \left[ a(z) a(1/z) \right]    (28)
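
Equation (27) is easy to evaluate when the encoder has a causal, stable inverse. The sketch below assumes such a case (a hypothetical pulse response 1, 0.5) and truncates the inverse pulse response after a fixed number of terms; an all-pass encoder, whose inverse is noncausal, is outside what this simple recursion handles.

```python
def inverse_pulse_response(a, length):
    """Causal inverse b of the pulse response a: sum_j a_j b_{q-j} = delta_q."""
    b = []
    for q in range(length):
        acc = 1.0 if q == 0 else 0.0
        acc -= sum(a[j] * b[q - j] for j in range(1, min(q, len(a) - 1) + 1))
        b.append(acc / a[0])
    return b

def white_noise_amplification(a, length=200):
    """Equation (27): [sum of a_q^2] * [sum of b_q^2], inverse truncated to `length` terms."""
    b = inverse_pulse_response(a, length)
    return sum(x * x for x in a) * sum(x * x for x in b)

print(white_noise_amplification([1.0, 0.5]))   # about 5/3
print(white_noise_amplification([2.0, 1.0]))   # same encoder scaled by 2: unchanged
print(white_noise_amplification([1.0]))        # pure gain: amplification of one
```

The result does not change when the encoder is scaled by a constant, which reflects the normalization to unity power amplification built into the definition.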



It can be readily seen from equation (26) that:

Lemma 2: When A is Toeplitz, the noise power amplification will be
unity whenever

    a(z) a(1/z) = constant

whether or not the noise is white, so long as it is Toeplitz.

An equivalent statement is that when A and N are Toeplitz, a sufficient
condition for the noise power amplification to be unity is that A'A = I_\infty,
which is equivalent to a(z) a(1/z) = constant.

The above lemma will be seen to be especially significant after it
is proved that unity noise power amplification is the least which can
ever be obtained, and when it is shown that simple a(z) functions,
namely all-pass functions, obey the conditions of the lemma. Notice
that the noise power amplification definition was based upon a receiver
which performed the inverse of the encoding operation, and not upon
a receiver which made a least square estimate of the signal given the a
posteriori noise statistics. Consequently, statements about least possible
noise power amplification are not applicable to adaptive types of
receivers such as those employing T_MEAN.

The following theorem is for block codes with n = k.

Theorem 1: When square block coding is used and N is proportional to
the identity, then the noise power amplification is always greater than or
equal to one, and it is one only when A is proportional to an orthogonal
matrix.

Proof: What is required is a demonstration that:

(i)     (1/k^2) [tr A^{-1} (A^{-1})'] [tr AA'] \ge 1    (29)

and

(ii) Equality occurs if and only if A is proportional to an orthogonal
matrix.    (30)

These are established in Section 2 of Appendix B.
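
Inequality (29) can be spot-checked for k = 2, where the inverse has a closed form. The three test matrices below (a rotation, a scaled rotation, and a skewed matrix) and the function name are illustrative assumptions; the check is numerical and does not replace the proof in Appendix B.

```python
import math

def noise_power_amplification_2x2(A):
    """(1/k^2) tr[A^{-1} (A^{-1})'] tr[A A'] for k = 2, using the closed-form inverse."""
    (a, b), (c, d) = A
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    tr_inv = sum(x * x for row in inv for x in row)   # tr[A^{-1}(A^{-1})'] = sum of squared entries
    tr_a = sum(x * x for row in A for x in row)       # tr[A A'] = sum of squared entries
    return tr_inv * tr_a / 4.0

theta = 0.7
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]
scaled_rotation = [[3.0 * x for x in row] for row in rotation]
skewed = [[1.0, 0.9], [0.0, 1.0]]

print("orthogonal:             ", noise_power_amplification_2x2(rotation))
print("proportional to orthog.:", noise_power_amplification_2x2(scaled_rotation))
print("skewed:                 ", noise_power_amplification_2x2(skewed))
```

The orthogonal matrix and its scaled version both give an amplification of one, while the skewed matrix gives nearly two, in agreement with parts (i) and (ii).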




The following corollary is the Toeplitz matrix limit version of the
above.

Corollary 2: When A and N are Toeplitz, and N is proportional to
I_\infty, a necessary and sufficient condition for unity noise power
amplification is that A'A = I_\infty, which is equivalent to a(1/z) a(z) =
constant. Otherwise the noise power amplification is greater than one.

When stationary (shift-register) linear-real decoding is used, then 
the decoding filter passes the noise through a Z-transform transfer 
function. When the noise is statistically stationary, the expected 
value of the mean square of the output noise is stationary, and de- 
pends only upon the amplitude of the transfer function averaged over 
the values of z. However, for burst noise the variance of the mean 
square of the decoded noise does depend upon the phase of the trans- 
fer function. For burst-noise or impulse-noise channels, this variance 
is minimized if the impulse response from the noise to the analog 
output of the decoder consists of many small terms instead of a few 
big ones. 

For quantized signals it is important to minimize the variance of 
noise power because fluctuations above the mean of the variance in- 
crease the error rate far more than fluctuations below the mean of 
the variance decrease it. In order to make the variance of the noise 
power small, the impulse response from noise to analog output must 
be near its peak for many times longer than the periods of fluctua- 
tion in the noise process. 

Because trace and expected value operators commute, the expected 
value of the output mean square error can be found by substituting 
E(N) where N appears, provided the noise process is stationary. This 
cannot be done for error probabilities after the quantizer, however. 

4.4 All-Pass Z-Transforms

A Z-transform a(z) is defined to be all-pass if |a(z)| = constant
for |z| = 1. These are the Z-transform version of two-sided Laplace
(or Fourier) transformed all-pass functions. Figure 2 shows some important
properties of all-pass Z-transforms, including the fact that
a(z) a(1/z) = constant is an alternative definition of an all-pass
Z-transform. The proofs of relationships in the figure not proved
previously are straightforward. The practical implications of these
relationships are that all stationary (shift-register) linear-real coders
should have Z-transforms which are all-pass, in order not to increase
the noise power amplification.
V. A. Kisel' has made an excellent short study of all-pass Z-transforms,
with a view toward using them as phase-correcting networks (Ref. 11).
He has shown that networks whose Z-transform transfer functions
are of the form

    a(z) = \frac{1 + g_1 z + g_2 z^2 + g_3 z^3}{g_3 + g_2 z + g_1 z^2 + z^3}

are all-pass, and that Fig. 3 synthesizes such functions. Additional
modifications are added to this basic structure and implementations
are proposed in the next section.
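
The all-pass property of a third-order function with mirrored numerator and denominator coefficients can be checked numerically. The sketch assumes the form a(z) = (1 + g_1 z + g_2 z^2 + g_3 z^3) / (g_3 + g_2 z + g_1 z^2 + z^3) with real gains; the gain values and the function name are hypothetical, chosen so that the denominator has no zeros on |z| = 1.

```python
import cmath
import math

def allpass_third_order(z, g1, g2, g3):
    """Ratio of a cubic to the same cubic with its coefficients in reverse order."""
    num = 1 + g1 * z + g2 * z ** 2 + g3 * z ** 3
    den = g3 + g2 * z + g1 * z ** 2 + z ** 3
    return num / den

g = (0.4, -0.2, 0.1)    # hypothetical real gains with |g1| + |g2| + |g3| < 1

# Magnitude at 64 equally spaced points on the unit circle.
magnitudes = [abs(allpass_third_order(cmath.exp(2j * math.pi * m / 64), *g))
              for m in range(64)]

print("min magnitude on |z| = 1:", min(magnitudes))
print("max magnitude on |z| = 1:", max(magnitudes))
print("magnitude at z = 0.5:    ", abs(allpass_third_order(0.5, *g)))
```

On the unit circle the denominator equals z^3 times the complex conjugate of the numerator, so the magnitude is one at every frequency and only the phase varies; off the circle, as at z = 0.5, the magnitude is no longer one.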

V. IMPLEMENTATION STUDIES 

The decoder for block coding with adaptive mean decoding appears
to require a large modern digital computer, and even then it could
probably only operate "on line" with a slow channel and a block
size not much over one hundred. Further research may lead to A
matrices for which (Q^{-1} + A'N^{-1}A) can be easily inverted for realistic
Q and N, or further research may lead to quicker inversion procedures,
but with the present techniques, block coding with adaptive
mean decoding appears to be decidedly less practical than other methods
of error control.

Fig. 3. A shift register (real-number arithmetic) whose Z-transform transfer
function is all-pass. (After V. A. Kisel', with modifications and a correction.)

The decoder for unadaptive block decoding appears to be generally
feasible if certain simplifying techniques are used. The most important
of these is the use of an A matrix which is a permutation matrix*
times diag(A_0, A_0, ..., A_0), where A_0 is itself a matrix. A_0
must be large enough to give adequate smear, whereas A must be
large enough to make error burst lengths considerably shorter than
the length of a code word. The hardware simplification achieved is
that the inverse of the small A_0 can be repeatedly applied in time by
the same hardware so as to invert the larger A. The practicality of
block coding appears to be slightly overshadowed by stationary
(shift register) coding, which offers somewhat simpler circuits and
freedom from the problem of block synchronization.

Stationary (shift register) coding appears to be the most practical 
form of linear-real coding. In effect, such coding is a smear-desmear 
type of signal processing whenever the encoding and decoding filters 
are inverses of each other and of the all-pass type. The fundamental 
reason for the practicality of shift register all-pass filters is that 
accurately tuned shift registers can be relatively inexpensively 
synthesized, even when the dispersion times are several seconds. This 
is partly so because the "absolute" tuning of a shift register is deter- 
mined by the clock pulses and not the precision of the components 
used in making the register, and partly because the "relative" tun- 
ing in a shift register is controlled by gains which in practice can be 
resistor values. As will be seen, analog shift registers can be imple- 
mented digitally, in which case complexity grows only as the loga- 
rithm of accuracy. In RLC filter synthesis, in contrast, cost grows 
rapidly with accuracy. 

Figure 4 is a block diagram for coding of the basic stationary (shift 
register) type. The decoder, because it must handle the analog signals 
from the channel instead of the digital input signals, is selected to 
have the impulse response simplest to implement, namely an all-pass 
causal 1/a(z) obtained by a shift register made from a tapped delay
line with a relatively moderate number of taps. The encoder is con- 
sequently left with approximating the noncausal a(z), which it does 
with a delay by means of a tapped delay line. 

The decoding shift register of Fig. 4 can be implemented by the 
arrangement of Fig. 5, which is a particular synthesis of the all-pass 
shift register shown in Fig. 3. In Fig. 5 all the digital-to-analog con- 
version is done by resistor summing networks. This is relatively in- 
expensive, although it does require that the flip-flop registers be de- 
signed for relatively precise voltage levels on the "on" and "off" states. 



* A permutation matrix is a matrix with a single one in each column and 
each row ; it is always nonsingular. 
Fig. 4. One possible general arrangement for unadaptive stationary (shift-register)
linear-real encoders and decoders. For multilevel signals, a Gray encoder
can be used before the analog summer, and the quantizer would incorporate a
Gray decoder.



Notice that in Fig. 5 there is only one analog-to-digital converter, 
because the analog feedback signal is added to the input signal before 
the conversion which is necessary in order to place the signals in 
the digital delay line. 

The cost of the encoding and decoding shift registers will be roughly 
proportional to the amount of smear that they introduce. The amount 
of smear necessary for given performance depends upon the noise 
power. It follows that a considerable economic saving can be obtained 
at given performance if circuits, inexpensive compared to the decoder, 
can be found to reduce the noise during bursts. 

A new circuit with this purpose for PAM systems is as shown in 
Fig. 6. The operation of the circuit requires that the interval between 
signal pulses be longer than the Nyquist interval for the bandwidth 
of the pulse shape. A way to find part of the noise component is to 
sample at the sampling instants, reconstruct the waveform which 
would be transmitted if these sample values were the data-signal 
values, and then subtract this signal from the actual received signal. 
(For proof of this statement, see appendix C.) An estimate of the 
instantaneous noise power can be made directly from those noise 
components which can be found. These components, for example, can 
be used to deduce the presence or absence of a noise burst. The circuit 
in Fig. 6 can obtain some noise components,* provided that the taps
on the delay line represent the PAM pulse value at t = nT/2, n odd.

* Specifically, Fig. 6 obtains the sample values of $\Delta(t)$ of Appendix C at
t = nT/2, n integer. Notice that by construction, $\Delta(nT/2) = 0$ for n even. By
the sampling theorem, just the samples of $\Delta(t)$ will be sufficient to reconstruct
$\Delta(t)$ provided that $C(\omega)$ is zero for $|\omega| > 2\pi/T$.

Fig. 6 — A stationary (shift register) coder with an adaptive decoder for PAM
channels with white burst noise and pulse rates less than the Nyquist rate.

The output noise estimate (specifically $\Delta(nT/2)$, n odd, in the language
of Appendix C and the previous footnote) is then squared to
produce the sample variance of the noise; then the sample variance 
function is put through a smoothing filter, as shown in Fig. 6. The 
optimization of this filter is complicated by the absence of an ap- 
propriate error criterion, but Wiener filtering principles could be used 
to optimize a mean square criterion. The problem formulation would 
specify that the sample variance is the true ensemble variance con- 
taminated by small sample-size noise, and that the cross-correlation 
between the halfway sample process and the sample process could be 
found from the autocorrelation function of the channel noise. 
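
The squaring and smoothing steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-pole (exponential) smoother and its constant `alpha` are hypothetical stand-ins for the smoothing filter of Fig. 6, whose optimal form the text leaves open, and the noise estimates and the burst threshold are invented values.

```python
def smoothed_noise_power(noise_estimates, alpha=0.25):
    """Square each noise estimate and smooth with a one-pole filter.

    alpha is a hypothetical smoothing constant, not a value from the paper.
    """
    power = 0.0
    smoothed = []
    for d in noise_estimates:
        # Sample variance (d * d) blended into the running estimate.
        power = (1.0 - alpha) * power + alpha * d * d
        smoothed.append(power)
    return smoothed


# Invented halfway-sample noise estimates: quiet, then a burst, then quiet.
quiet = [0.1, -0.2, 0.1]
burst = [3.0, -2.5, 2.8]
estimates = smoothed_noise_power(quiet + burst + quiet)

# An arbitrary threshold of 1.0 turns the estimate into a burst indicator.
burst_flag = [p > 1.0 for p in estimates]
```

The smoothed estimate rises quickly when the burst starts and decays gradually afterwards; a Wiener design as suggested in the text would trade off these two behaviors against the noise statistics.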

Finally, a two-input nonlinear memoryless filter is used, also shown
in Fig. 6. It is reasonable to optimize this filter using a mean square
criterion: in the limit of infinite smearing, the smearing and Gaussianizing
effects of the decoding shift register leave only the power of the noise
significant. Some improvement may be possible by using other criteria, but
the details appear to be very difficult and are unsolved.

The general scheme of Fig. 6 appears to be the most economical 
form of linear-real coding when the channel is used for PAM at less 
than the Nyquist rate. Telephone lines are used at less than the Ny- 
quist rate because they are used with signals with nonsharp-cutoff 
frequency characteristics. Radio links can obtain information on non- 
tuned burst noise, such as static, by listening on adjacent frequencies, 
and could therefore provide the smoothed estimate of instantaneous 
noise power, needed as an input to the two-input memoryless filter, 
by other means. Instantaneous carrier-to-noise ratios could be used 
for carrier systems, for example. 

It is also possible to use a different principle of instantaneous noise 
power estimation which does not require a PAM channel used below 
the Nyquist rate. The other principle uses the quantized structure of 
the data stream. It is implemented by a decoder with a "pilot" decoder 
which decodes, followed by an operator which squares the difference 
between the signal and the nearest quantization level, which is then 
smoothed and put into a two-input memoryless filter like that of Fig. 
6, following which is the regular decoding shift register and quantizer.
This scheme is probably less practical than Figs. 4 and 6, but it does 
give conceptual insights into some of the signal properties which can 
be used in decoding, especially for burst channels. 
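
The quantity formed by the "pilot" decoder can be sketched as follows. The function name and the two-level alphabet are illustrative assumptions; the paper does not give an implementation.

```python
def squared_quantization_error(pilot_output, levels=(-1.0, 1.0)):
    """Squared distance from a pilot-decoded value to the nearest level."""
    nearest = min(levels, key=lambda level: abs(pilot_output - level))
    return (pilot_output - nearest) ** 2


clean = squared_quantization_error(0.9)   # close to the +1 level
noisy = squared_quantization_error(-1.5)  # far from the -1 level
```

These squared errors would then be smoothed and supplied as the second input of the two-input memoryless filter.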

VI. COMMENTS AND SIMULATION RESULTS 

Any sample of the decoded noise is a weighted sum of the random 
channel noises at many other sample instants. When the number of 
terms in this sum approaches infinity and the relative size of the larg- 
est term in the sum approaches zero, the central limit theorem applies. 
It will probably be true that practical designs will not have the con- 
ditions of the central limit theorem fulfilled to the extent that very 
small digital error probabilities can be computed by using integrals 
of the tails of the gaussian distribution. 

Nevertheless, the fact that the decoded noise at any instant is a 
sum of the random channel noises at many instants will tend to make 
the decoded noise have some of the characteristics of a gaussian dis- 
tribution. One characteristic that the decoded noise will have is the 
small probability that the decoded noise is larger than three or four 
standard deviations. This effect of the decoding filter (or matrix) 
will be called the gaussianizing property. 
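
A small numerical sketch may make the spreading behind the gaussianizing property concrete. It assumes a square orthonormal decoder built from a 16 by 16 Hadamard matrix generated by the Sylvester doubling rule (which is not necessarily the procedure of Golomb and his colleagues) and an invented single impulse of amplitude 8.

```python
import math


def hadamard(n):
    """Sylvester construction; n is assumed to be a power of two."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def decode_noise(noise):
    """Apply the transpose of A = H / sqrt(n) to a channel noise block."""
    n = len(noise)
    h = hadamard(n)
    scale = 1.0 / math.sqrt(n)
    return [scale * sum(h[l][i] * noise[l] for l in range(n)) for i in range(n)]


impulse = [0.0] * 16
impulse[5] = 8.0                       # one large channel impulse
spread = decode_noise(impulse)
peak = max(abs(v) for v in spread)     # largest decoded noise sample
energy = sum(v * v for v in spread)    # unchanged by the orthonormal decoder
```

The impulse energy is preserved, but it is shared equally among the 16 decoded components, so the peak falls from 8 to 8/4 = 2. With many independent channel noises the decoded samples become sums of many small terms, which is the situation the central limit theorem addresses.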

The use of nonlinear filters in conjunction with linear-real coders
is extremely effective, since such filters can considerably reduce both
the noise power and the probability that the noise has a large peak. 
By reducing the probability that the noise has a large peak, the 
desirable gaussian distribution of the decoded noise occurs with 
smaller matrices, smaller shift-registers, or simpler all-pass filters. 
In the limit when the decoded noise is actually gaussian, the noise 
power is the only significant statistic; the higher-order moments of 
the noise become insignificant due to the gaussian-distributing prop- 
erty of the decoder. It is therefore quite appropriate to design the 
nonlinear filter using a mean square error criterion, as is done in 
Section VIII of Reference 9.

Linear-real coding has features which could greatly improve error 
detection in channels with burst noise. When erasure zones are used 
to detect errors, the gaussian-distributing property of the decoder 
greatly increases the ratio of the probability in the erasure zone to 
the probability beyond the erasure zone. In addition, the noise spread- 
ing gives more opportunities for a signal to land in an erasure zone in 
the presence of impulses or bursts, because of randomness of the 
decoded noise, and, with suitable designs, because of deterministic 
reasons. 

If the communications channel is, in order, digital processor to 
analog transmitter to analog receiver to digital processor, then linear- 
real block coding permits the energy per transmitted data digit to 
be altered by reprogramming the digital processors, instead of physi- 
cally retuning bandwidths of analog equipment. Although this option 
does not in itself affect error control, it perhaps could greatly simplify 
the implementation of adaptive communications systems in which the 
signal energy per digit is adjusted to be appropriate for the transmis- 
sion conditions, message importance, or message load. 

A digital computer simulation was run of an additive-noise channel 
with a linear-real block-code encoder at the input, and several types 
of decoders at the output. Table III shows the results of the simulation.
The listed results are averages. The A matrix is the Hadamard
matrix which is generated recursively according to the procedure described
by Golomb and his colleagues (p. 55, first paragraph in proof
of Theorem 4.5). 15 The N matrix had zeros in all off-diagonal components,
and independent random variables on the diagonal, which
were 0.3 with probability 0.7 and 8.3 with probability 0.3. In accordance
with Theorem 4 in Appendix D, these can be worst-case
values which then give the worst-case decoded mean square error.

Table III — Simulated Performance of Linear-Real Coders

                                            m.s. error in decoded components
                                            ------------------------------------------------
                                            When receiver   When receiver uses
                                            uses perfect    N = diag (n_1', ..., n_n')
Estimator                                   N matrix        where n_i' = max (0.3, /i* - 1)   Note
------------------------------------------  --------------  --------------------------------  ----
Mean estimator                              0.516           0.711                             (a)
Unattenuated estimator                      2.821           2.821                             (b)
Unadaptive estimator                        2.821                                             (c)
Clip estimator, parameters (1.2, 0.9, 4.0)  1.805
Clip estimator, parameters (1.0, 0.75, 3.0) 1.152
Clip estimator, parameters (0.8, 0.6, 2.0)  0.771
Clip estimator, parameters (0.6, 0.6, 1.5)  0.677
Clip estimator, parameters (0.5, 0.5, 1.3)  0.645
Clip estimator, parameters (0.4, 0.5, 1.0)  0.649

(a) The lower bound of equation (13) is somewhat loose; it gives 0.297.
(b) Equation (12) has correctly predicted that the error would be the same
    as that of the unadaptive estimator because A is square.
(c) Equation (7) averaged over the possible N matrices gives m.s. error of
    2.70. The randomness of the N matrix accounts for the difference.

Channel: Additive noise channel sending +1 and -1 binary numbers and block
encoding with an A which is $k^{-1/2}$ times the first k columns of an n by n
Hadamard matrix. k = 16. n = 16.
Number of words in simulation: 10.
Noise type: Zero-mean white Gaussian noise has variance 0.3 with
probability 0.7 and variance 8.3 with probability 0.3.

Once the N matrix was generated, the channel noises were generated
randomly from a Gaussian distribution having the given N
for a covariance matrix. The clip estimator used a decoder which
first put each received component through a memoryless nonlinearity,
and then decoded the resulting components with the unadaptive estimator.
The parameters (x, y, z) indicate that the nonlinearity is a
continuous odd function having slope 1 for inputs of magnitude less
than x, slope y for inputs of magnitude between x and z, and
slope 0 for inputs of magnitude exceeding z. These parameters can be
chosen to approximate the least mean square memoryless nonlinear
filter referred to earlier, or they can be found by a trial-and-error
procedure with either analysis or simulations to evaluate the resulting
error.
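
The three-segment nonlinearity described by the parameters (x, y, z) can be written directly from that description. The function name is an assumption made for illustration; the parameter values in the example are one of the sets listed in Table III.

```python
def clip_nonlinearity(u, x, y, z):
    """Continuous odd function: slope 1 below x, slope y from x to z, slope 0 above z."""
    mag = abs(u)
    if mag <= x:
        out = mag
    elif mag <= z:
        out = x + y * (mag - x)
    else:
        out = x + y * (z - x)   # saturated value
    return out if u >= 0 else -out


# Example with the parameter set (0.5, 0.5, 1.3).
small = clip_nonlinearity(0.3, 0.5, 0.5, 1.3)    # passes through unchanged
medium = clip_nonlinearity(1.0, 0.5, 0.5, 1.3)   # compressed
large = clip_nonlinearity(-5.0, 0.5, 0.5, 1.3)   # saturated, sign preserved
```

In the clip estimator each received component would pass through this function before the unadaptive decoder is applied.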

The following two conclusions can be drawn from the simulation, 
but it would not be appropriate to generalize them to cases of non- 
square A matrices : 

(i) For intermittent additive impulse noise of the type simulated,
the simple clip estimator scheme, for appropriate parameters, is
almost as good as the mean estimator, even though it is unadaptive
and therefore requires only a simple receiver.

(ii) The use of rather crude algorithms for generating an estimate
of N appeared to be inferior to clip estimator decoding with appropriate
parameters.

APPENDIX A

Justification of Table II Equations

A.1 Unadaptive Estimator

In the case of the unadaptive estimator $TA = I_{k \times k}$, so equation
(3) reduces to equation (4) shown in Table II. Now in general, when
M is the covariance matrix of the decoded noise, the mean square
error will be the average of the on-diagonal terms of M, or in other
words, $(1/k)\,\mathrm{tr}\,M$. In this way (5) follows from (4). Equation (6)
follows from (4) because $A^t A = (n/k)\,I_{k \times k}$ in this case.

A Hadamard matrix is a square matrix with +1 or -1 elements
and orthogonal columns. (Golomb and his associates fully describe
Hadamard matrices and their application to binary block codes. 15 )
In deriving equation (7), a straightforward evaluation of (6) under
the assumption of diagonal N gives the result that

$$M_{ij} = \left(\frac{k}{n}\right)^{2} \sum_{l=1}^{n} a_{li}\, a_{lj}\, n_l$$

assuming:

T is the unadaptive estimator

$N = \mathrm{diag}\,(n_1, n_2, \ldots, n_n)$.

The on-diagonal terms of the above can be evaluated by using the
Hadamard assumption, which causes $(a_{li})^2$ to equal $1/k$ for all l and
i. This gives

$$\frac{1}{k}\,\mathrm{tr}\,M = \frac{k}{n}\left[\frac{1}{n}\sum_{l=1}^{n} n_l\right]$$

assuming:

T is the unadaptive estimator

$N = \mathrm{diag}\,(n_1, n_2, \ldots, n_n)$

A is $1/k^{1/2}$ times the first k columns of any Hadamard matrix.

Notice that the term in brackets is AM, the arithmetic mean of the
set $(n_1, n_2, \ldots, n_n)$.

A.2 Unattenuated Estimator 

Equation (8) comes from (3) by direct substitution for the T 
matrix. 

A.3 Mean Estimator 

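In the case of the mean estimator,

$$\begin{aligned}
I_{k \times k} - TA &= I_{k \times k} - (Q^{-1} + A^t N^{-1} A)^{-1} A^t N^{-1} A \\
&= I_{k \times k} - (Q^{-1} + A^t N^{-1} A)^{-1} (A^t N^{-1} A + Q^{-1} - Q^{-1}) \\
&= (Q^{-1} + A^t N^{-1} A)^{-1} Q^{-1}.
\end{aligned} \tag{31}$$

Substituting $(Q^{-1} + A^t N^{-1} A)^{-1} Q^{-1}$ for $(I_{k \times k} - TA)$ in equation (3)
readily shows that

$$M = (Q^{-1} + A^t N^{-1} A)^{-1}$$

assuming T is the mean estimator.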
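
The closed form for M can be checked numerically in the scalar case (k = n = 1), where every matrix is a single number. The sketch below assumes that equation (3), which is not reproduced in this appendix, has the usual form $M = (I - TA)\,Q\,(I - TA)^t + T N T^t$; the function name and the numerical values are illustrative only.

```python
def mean_estimator_scalar(a, q, n):
    """Scalar mean estimator, its error from the assumed eq. (3), and the closed form."""
    t = (a / n) / (1.0 / q + a * a / n)              # T = (Q^-1 + A'N^-1 A)^-1 A'N^-1
    m_direct = (1.0 - t * a) ** 2 * q + t * t * n    # assumed form of equation (3)
    m_closed = 1.0 / (1.0 / q + a * a / n)           # M = (Q^-1 + A'N^-1 A)^-1
    return t, m_direct, m_closed


t_quiet, direct_quiet, closed_quiet = mean_estimator_scalar(1.0, 1.0, 0.3)
t_burst, direct_burst, closed_burst = mean_estimator_scalar(1.0, 1.0, 8.3)
```

Under these assumptions the two error expressions agree for both noise variances, and the estimator gain is much smaller when the noise variance is large.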
A.4 Joint Mean and Unattenuated Estimator

The unattenuated estimator is the special case of the mean estimator
when $Q^{-1} \to 0$. It is convenient to handle the two cases together
by using the variable $\beta = 1$ when the mean estimator is used, and
$\beta = 0$ when the unattenuated estimator is used.

The next two equations use an approach from Berkowitz. 10 Equation
(9) or (11) can be simplified by using the fact that, for any
nonsingular Z,

$$\mathrm{tr}\,Z^{-1} = \sum_{i} \frac{1}{\lambda_i(Z)}$$

where $\lambda_i(Z)$ denotes the i-th unordered eigenvalue of Z. The result is
equation (12). When the signal is white, the relation
$\lambda_i(tI + Z) = t + \lambda_i(Z)$ can be used, giving the equality in equation (13). When
$\beta = 1$ the positive semidefiniteness of $A^t N^{-1} A$ causes its eigenvalues
to be real and nonnegative; when $\beta = 0$ the positive definiteness of
$A^t N^{-1} A$ will now need to be assumed. Because $1/(\beta s^{-1} + \lambda)$ is a convex
upward function of $\lambda$ in the region of possible $\lambda$, the inequality part
of (13) follows by convexity. This inequality will prove useful later
when, under additional assumptions, the term in brackets will be
found in closed form.

For square orthonormal A, it follows that $A^{-1} = A^t$, so

$$\lambda_i(A^t N^{-1} A) = \lambda_i(A^{-1} N^{-1} A) = \lambda_i(N^{-1}) = \frac{1}{\lambda_i(N)}.$$

Equation (14) results when the above is substituted into (13). Notice
that when $\beta$ is zero and N is diagonal, this will reduce to AM. On
the other hand, when $\beta$ is one, this will be less than AM.

When A is rectangular, the next analysis leads to a closed form
solution for the average of the eigenvalues of $A^t N^{-1} A$, under the
assumptions of equation (15), and it also leads to upper bounds upon
the m.s. error. The exact values of the components of A may enter
into the formulas for some statistics of the error. However, in the
first and second moment statistics to be investigated under the particular
assumptions made, it turns out that the only important property
of the A matrix is the inner product between the i-th and j-th
columns. This will always be $(n/k)\,\delta_{ij}$, independent of the particular
Hadamard matrix upon which A is based. However, since higher-order
moments are significant, especially in quantized channels, it is
likely that some Hadamard matrices might be more useful for practical
purposes than others.

Under the assumptions of equation (15), straightforward calculations
will show the following. HM is the harmonic mean of the diagonal
components of N; equation (15) includes its formula.

GO A'JT^-HhfJ+F, 



kHM 



where, for large k 



EKY k m = £ 



all i, j (32) 



w #H s ^' (33) 

(m) Var [| tr A'A^a] = p a 2 . (34) 

The above equations are especially useful because they show that

$$\text{the average of the eigenvalues of } A^t N^{-1} A = \frac{n}{k\,\mathrm{HM}}.$$

This can be substituted into equation (13) to prove equation (16).
Equation (16) becomes an equality when all of the eigenvalues of
$A^t N^{-1} A$ are equal; otherwise the mean square error is greater.
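
As a numerical illustration, the noise statistics of the simulation in Section VI (variance 0.3 with probability 0.7 and variance 8.3 with probability 0.3) can be inserted into these expressions. The sketch below rests on several assumptions that are not stated in this appendix: unit signal variance (s = 1, as for +1 and -1 data), a square A (n = k), the mean estimator ($\beta = 1$), a bound of the form $1/(\beta s^{-1} + n/(k\,\mathrm{HM}))$, and ensemble probabilities in place of a sampled N matrix.

```python
def arithmetic_mean(values, probs):
    """Probability-weighted arithmetic mean of the noise variances."""
    return sum(p * v for v, p in zip(values, probs))


def harmonic_mean(values, probs):
    """Probability-weighted harmonic mean of the noise variances."""
    return 1.0 / sum(p / v for v, p in zip(values, probs))


def lower_bound(beta, s, n_over_k, hm):
    """Assumed form of the bound: 1 / (beta / s + n / (k * HM))."""
    return 1.0 / (beta / s + n_over_k / hm)


variances = (0.3, 8.3)
probs = (0.7, 0.3)
am = arithmetic_mean(variances, probs)
hm = harmonic_mean(variances, probs)
bound = lower_bound(beta=1.0, s=1.0, n_over_k=1.0, hm=hm)
```

Under these assumptions AM is 2.70 and the bound is about 0.297, the two values quoted in the remarks of Table III for the unadaptive and mean estimators respectively.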

The m.s. error evaluated according to equation (12) requires the
computation of eigenvalues of what is typically a rather large matrix,
and the trace formula of (9) or (11) yields little insight, whereas the
bound of equation (16) is a simple closed-form equation. The question
therefore arises of whether the bound given by (16) is close enough
to be used for design and analysis purposes as an equality.
The analysis which follows will derive an upper bound for the
m.s. error, which could be used to develop some sufficient conditions
for near equality of equation (16).

Let equation (32) be used to define $Y_k$; let $\lambda'(Y_k)$ denote

$$\max_i |\lambda_i(Y_k)|,$$

and let $\tau$ be any number such that

$$\tau \le \frac{\sum_{i=1}^{k} |\lambda_i(Y_k)|^2}{[\lambda'(Y_k)]^2}. \tag{35}$$

Notice that $\tau$ can always be as large as 1 and never exceeds k. The
second of the following inequalities is Schur's inequality, which is
valid for any square $Y_k$. 10 The first comes from (35).

$$\tau\,[\lambda'(Y_k)]^2 \le \sum_{i=1}^{k} |\lambda_i(Y_k)|^2 \le \sum_{i=1}^{k} \sum_{j=1}^{k} |(Y_k)_{ij}|^2. \tag{36}$$

Assuming that k is large enough for the weak law of large numbers 
to hold permits (32) to be used to evaluate the above double sum, 
so that with a few manipulations (36) reduces to 

X'(F A ) 5S J5. (37) 

\ T 

By using equations (13), (32), (33), and (37), and a relatively obvious
property of convex functions,* equation (17) is established.

* The property is that if f(x) is convex downward, and

$$\sum_{i=1}^{k} x_i = 0, \qquad \max_i |x_i| \le R,$$

then

$$\frac{1}{k} \sum_{i=1}^{k} f(m + x_i) \le \tfrac{1}{2} f(m - R) + \tfrac{1}{2} f(m + R).$$

APPENDIX B

Relating Toeplitz Matrix Operations with Z-Transforms

Theorem 2: If

$$A_{ij} = a_{i-j}$$

and if

$$a(z) = \sum_{q=-\infty}^{\infty} a_q z^{-q}$$

converges on |z| = 1 and has no poles or zeros for a finite distance
from |z| = 1, then $A^{-1}$ exists and

$$a^{-1}(z) = \frac{1}{a(z)}.$$

Proof: Let

$$b(z) = \frac{1}{a(z)} \quad \text{for } |z| = 1.$$

The assumptions on a(z) cause $a_q$ and $b_q$ to have geometrical decay,
and therefore the following converge absolutely:

$$(BA)_{ij} = \sum_{q=-\infty}^{\infty} B_{iq} A_{qj} = \sum_{q=-\infty}^{\infty} b_{i-q}\, a_{q-j}.$$

Also reducing to the above is $(AB)_{ij}$. Letting $q' = q - j$ gives

$$(BA)_{ij} = (AB)_{ij} = \sum_{q'=-\infty}^{\infty} b_{(i-j)-q'}\, a_{q'} = b_q * a_q \,\big|_{q=i-j}$$

where the * denotes the convolution sum in the line above. Because
b(z)a(z) = 1, it follows that

$$b_q * a_q \,\big|_{q=i-j} = \delta_{i,j}.$$

So $BA = AB = I_\infty$, thus proving that B is the inverse of A, which
completes the proof.
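
The theorem can be illustrated with a first-order example chosen here for convenience (it does not appear in the paper): $a(z) = 1 - c\,z^{-1}$ with c = 0.5, whose only zero lies well inside the unit circle. Its reciprocal expands as $b(z) = \sum_{q \ge 0} c^q z^{-q}$, and the sketch checks that the corresponding Toeplitz product is the identity on a finite window of indices.

```python
C = 0.5  # illustrative coefficient with |C| < 1


def a_coef(q):
    """Coefficients of a(z) = 1 - C z^-1."""
    if q == 0:
        return 1.0
    if q == 1:
        return -C
    return 0.0


def b_coef(q):
    """Coefficients of b(z) = 1 / a(z) = sum over q >= 0 of C^q z^-q."""
    return C ** q if q >= 0 else 0.0


def ba_entry(i, j):
    """(BA)_ij = sum over q of b_(i-q) a_(q-j); only q = j and q = j + 1 contribute."""
    lo, hi = min(i, j) - 2, max(i, j) + 2
    return sum(b_coef(i - q) * a_coef(q - j) for q in range(lo, hi + 1))


window = [[ba_entry(i, j) for j in range(-3, 4)] for i in range(-3, 4)]
```

Because a(z) has only two nonzero coefficients, each entry of the infinite product reduces to a two-term sum, so the finite window is computed without truncation error.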
The following have proofs similar to that of the theorem.

Lemma 3: If A and B are Toeplitz, then C = AB is Toeplitz with

$$c(z) = a(z)\,b(z).$$

Lemma 4: The half-power of a Toeplitz matrix N can be defined by

$$n^{1/2}(z) = \sqrt{n(z)}.$$

The following has a straightforward proof:

Lemma 5: If A is Toeplitz, then $A^t$ is Toeplitz and $a^t(z) = a(1/z)$.

The following relates linear-real coding for Toeplitz matrices with
Wiener filtering.

Theorem 3: When A, Q, and N are infinite Toeplitz, then the least
mean square estimator

$$T = (Q^{-1} + A^t N^{-1} A)^{-1} A^t N^{-1}$$

is infinite Toeplitz, and is the noncausal Wiener filter, given by

$$t(z) = \frac{q(z)\,a(1/z)}{n(z) + a(1/z)\,q(z)\,a(z)}.$$

Proof: Applying Theorem 2 and Lemmas 3 and 5 to the expression for T gives

$$t(z) = \frac{\dfrac{a(1/z)}{n(z)}}{\dfrac{1}{q(z)} + \dfrac{a(1/z)\,a(z)}{n(z)}}.$$

This equals the stated result, which completes the proof.

Corollary 3: When $A = I_\infty$,

$$T = (Q^{-1} + N^{-1})^{-1} N^{-1}$$

and

$$t(z) = \frac{q(z)}{q(z) + n(z)}$$

is the noncausal Wiener filter.
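
The transfer function of Theorem 3 can be evaluated on the unit circle as in the following sketch. The white signal and noise spectra (q = 1, n = 0.3), the first-order coder $a(z) = 1 - 0.5\,z^{-1}$, and the frequency 0.7 rad per sample are illustrative assumptions only.

```python
import cmath


def wiener_response(q, n, a, omega):
    """t(z) = q(z) a(1/z) / (n(z) + a(1/z) q(z) a(z)) evaluated at z = exp(j omega)."""
    z = cmath.exp(1j * omega)
    a_fwd = a(z)
    a_rev = a(1.0 / z)
    return q(z) * a_rev / (n(z) + a_rev * q(z) * a_fwd)


def white_signal(z):
    return 1.0   # hypothetical flat signal spectrum


def white_noise(z):
    return 0.3   # hypothetical flat noise spectrum


def identity_coder(z):
    return 1.0   # A = I, the case of Corollary 3


def first_order_coder(z):
    return 1.0 - 0.5 / z


omega = 0.7
t_identity = wiener_response(white_signal, white_noise, identity_coder, omega)
t_coded = wiener_response(white_signal, white_noise, first_order_coder, omega)
overall = t_coded * first_order_coder(cmath.exp(1j * omega))  # coder followed by decoder
```

With the identity coder the response reduces to q/(q + n), as in Corollary 3. For a coder with real coefficients the cascade of coder and decoder has a real gain between 0 and 1 at each frequency.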

The following proof of equation (29) and statement (30) follows
the ideas of J. E. Mazo. For square A,

$$\mathrm{tr}\,[A^{-1} (A^{-1})^t] = \mathrm{tr}\,[(A^{-1})^t A^{-1}],$$

since in general tr HC = tr CH for square H and C. Now let $B = A A^t$.
Notice that $(A^{-1})^t A^{-1}$ is $B^{-1}$. Equation (29) is then:

$$\frac{1}{k^2}\,\mathrm{tr}\,B^{-1}\,\mathrm{tr}\,B \ge 1.$$

But

$$\mathrm{tr}\,B = \sum_{i} \lambda_i(B),$$

so

$$\frac{1}{k^2}\,\mathrm{tr}\,B^{-1}\,\mathrm{tr}\,B = \frac{\dfrac{1}{k} \sum_{i} \lambda_i(B)}{\left[\dfrac{1}{k} \sum_{i} \dfrac{1}{\lambda_i(B)}\right]^{-1}}.$$

The numerator and denominator are respectively the arithmetic and
harmonic means of the eigenvalues of the B matrix. Hardy, Littlewood,
and Polya (p. 26, special case of 2.9.1) show that this ratio
always exceeds one, except when the eigenvalues are all the same, in
which case it is one. 17 This proves (29).

At equality B has equal eigenvalues, and since it is symmetric the
eigenvectors span the space and B is proportional to an orthogonal
matrix:

$$\begin{aligned}
B = \lambda U &= P^{-1} (\lambda I) P \\
&= P^t (\lambda I) P \quad \text{because B is symmetric} \\
&= \lambda I.
\end{aligned}$$

Therefore $A A^t = \lambda I$ and $A^{-1} = \lambda^{-1} A^t$, so A is proportional to an
orthogonal matrix at equality, thereby establishing (30) and completing the
proof of (29) and (30), and with it the proof of Theorem 1.
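
The ratio of arithmetic to harmonic mean used in this proof is easy to evaluate for a given set of eigenvalues. The sketch below uses invented eigenvalue lists and a hypothetical function name.

```python
def am_hm_ratio(eigenvalues):
    """(1/k^2) tr(B^-1) tr(B), written as arithmetic mean over harmonic mean."""
    k = len(eigenvalues)
    arithmetic = sum(eigenvalues) / k
    harmonic = k / sum(1.0 / lam for lam in eigenvalues)
    return arithmetic / harmonic


equal_case = am_hm_ratio([2.0, 2.0, 2.0])   # B proportional to the identity
unequal_case = am_hm_ratio([1.0, 4.0])      # unequal eigenvalues
```

The ratio equals one only in the equal-eigenvalue case, which corresponds to A being proportional to an orthogonal matrix.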

APPENDIX C 

Finding Noise Component 

In the text we discuss the circuit shown in Fig. 6 and state that a 
way to find part of the noise component is to sample at the sampling 
instants, reconstruct the waveform which would be transmitted if 
these sample values were data-signal values, and then subtract this 
signal from the actual received signal. 

The proof of this statement requires the use of the valid converse
of the sampling theorem, which states that an arbitrary function
with frequency components out to $|\omega| = \pi/T_1$ cannot be reconstructed
from samples every T seconds if $T > T_1$. If it is assumed that

(i) $h(0) = 1$
(ii) $h(nT) = 0$ for $n \ne 0$
(iii) $H(\omega)$ is nonzero for $|\omega| < \pi/T_1$
(iv) $T_1 < T$
(v) The additive noise c(t) has components at all frequencies for
which $H(\omega)$ has components,

then it follows that

$$\text{actual sample at } t = c(t) + \sum_{n} r_n\, h(t - nT)$$

$$\text{predicted sample at } t \text{ based upon samples at } nT,\ n = 0, \pm 1, \pm 2, \ldots = \sum_{n} [r_n + c(nT)]\, h(t - nT)$$

$$\Delta(t) = \text{difference of the above} = c(t) - \sum_{n} c(nT)\, h(t - nT).$$

By using a well-known result in sampling theory 12 the Fourier transform
of $\Delta(t)$ can be written as either of the following.

$$\begin{aligned}
\Delta(\omega) = \mathcal{F}[\Delta(t)] &= C(\omega) - H(\omega) \sum_{n} c(nT)\, e^{-j\omega nT} \\
&= C(\omega) - H(\omega)\, \frac{1}{T} \sum_{n} C\!\left(\omega - \frac{2\pi n}{T}\right).
\end{aligned}$$

By the converse to the sampling theorem, no $H(\omega)$ will make $\Delta(\omega)$
zero for all $\omega$. Consequently, $\Delta(\omega)$ contains some components of the
additive noise. If $T_1 \ge T/2$, then the direct sampling theorem shows
that samples every T/2 are sufficient to reconstruct $\Delta(t)$.
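
The construction of $\Delta(t)$ can be tried numerically. The sketch below is an illustration under assumptions that are not in the paper: the pulse is taken to be $h(t) = [\sin(\pi t/T)/(\pi t/T)]^2$, which satisfies conditions (i) to (iv) with $T_1 = T/2$, and the noise burst c(t) is an invented narrow Gaussian bump.

```python
import math

T = 1.0  # pulse interval (arbitrary units)


def h(t):
    """Assumed pulse shape: sinc squared, with h(0) = 1 and h(nT) = 0 for n != 0."""
    if t == 0.0:
        return 1.0
    x = math.pi * t / T
    return (math.sin(x) / x) ** 2


def c(t):
    """Hypothetical noise burst: a narrow Gaussian bump centered at 0.4 T."""
    return 3.0 * math.exp(-((t - 0.4 * T) / (0.2 * T)) ** 2)


def delta(t, n_values=range(-50, 51)):
    """Delta(t) = c(t) - sum over n of c(nT) h(t - nT)."""
    return c(t) - sum(c(n * T) * h(t - n * T) for n in n_values)


at_sample = delta(1.0 * T)    # a regular sampling instant
halfway = delta(0.5 * T)      # a halfway instant, as tapped in Fig. 6
```

In this example $\Delta(t)$ vanishes at the regular sampling instants, as stated in the footnote to Section V, while the halfway sample retains most of the burst amplitude and so can indicate the presence of the burst.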

APPENDIX D 

The purpose of this appendix is to state and prove the following 
theorem. 

Theorem 4: Assuming

(i) Channel I has additive noise c independent of the signal b
(ii) Channel II has additive noise g independent of the signal b
(iii) c and g are zero mean, and each is even about its mean
(iv) $F(\alpha) = p(|c| \le \alpha)$, ($\alpha$ is defined to be nonnegative)
(v) $K(\alpha) = p(|g| \le \alpha)$
(vi) In both channels signal plus noise are passed through the memoryless
nonlinearity nl( ) at the receiver
(vii) nl(x) is odd
(viii) nl(x) has a slope bounded between 0 and 1 for all x, and this slope
is monotonically decreasing in |x|
(ix) The mean square errors of channels I and II are $MSE_I$ and $MSE_{II}$,
respectively,
(x) Channel I is noisier than channel II in the sense that $F(\alpha) \le K(\alpha)$
for all $\alpha$, which means that for every bit of probability density c has
at $\pm\beta$, g has an equal amount at a distance which is at most $\beta$ from the origin,

then

$$MSE_I \ge MSE_{II}.$$

(Thus, worst-case noise gives worst-case results with these nonlinearities.)

The next definition and lemma are used in the proof of Theorem 4.

Let $MS(\alpha)$ denote the special case of $MSE_I$ of Theorem 4 when

$$p_2(c) = \tfrac{1}{2}\,\delta(c + \alpha) + \tfrac{1}{2}\,\delta(c - \alpha) \tag{38}$$

where $\alpha$ is a positive constant, and $\delta(\,)$ denotes the Dirac impulse
function.
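Lemma 6: Under the conditions of Theorem 4, $\partial MS(\alpha)/\partial\alpha \ge 0$.

Proof:

$$MS(\alpha) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [nl(b + c) - b]^2\, p_1(b)\, p_2(c)\, dc\, db. \tag{39}$$

Substituting equation (38) for $p_2(c)$, integrating with respect to c, and
then taking partial derivatives with respect to $\alpha$ gives

$$\frac{\partial MS(\alpha)}{\partial \alpha} = \int_{-\infty}^{\infty} \Big\{ \underbrace{[nl(b + \alpha) - b]}_{A}\; \underbrace{\frac{d\,nl(x)}{dx}\Big|_{x = b + \alpha}}_{B} - \underbrace{[nl(b - \alpha) - b]}_{C}\; \underbrace{\frac{d\,nl(x)}{dx}\Big|_{x = b - \alpha}}_{D} \Big\}\, p_1(b)\, db. \tag{40}$$

Now

$$[\text{assumptions 7, 8}] \Rightarrow [C \le 0 \text{ for } b \ge 0,\ A \ge 0 \text{ for } b \le 0] \tag{41}$$

$$[\text{assumption 8}] \Rightarrow [D \ge 0,\ B \ge 0] \tag{42}$$

$$[\text{assumption 8}] \Rightarrow [B \le D \text{ when } b \ge 0,\ B \ge D \text{ when } b \le 0]. \tag{43}$$

Therefore

$$-CD \ge -CB \quad \text{when } b \ge 0 \tag{44}$$

$$AB \ge AD \quad \text{when } b \le 0. \tag{45}$$

Consequently

$$\frac{\partial MS(\alpha)}{\partial \alpha} \ge \int_{-\infty}^{0} [(A - C) D]\, p_1(b)\, db + \int_{0}^{\infty} [(A - C) B]\, p_1(b)\, db. \tag{46}$$

Now

$$A - C = \int_{b - \alpha}^{b + \alpha} \frac{d\,nl(x)}{dx}\, dx \tag{47}$$

and by assumption 8 the integrand is nonnegative, so both sides of
(47) are nonnegative. This fact and (42), and the nonnegativeness
of $p_1(b)$, make the right side of (46) nonnegative, which proves the
lemma.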

Proof of Theorem 4: The assumed evenness of the noises, and the
linearity of the expectation operator, permit the $MS(\alpha)$ function to
be used to evaluate the mean square error, as follows

$$MSE_I - MSE_{II} = \int_{0}^{\infty} MS(\alpha)\, dF(\alpha) - \int_{0}^{\infty} MS(\alpha)\, dK(\alpha). \tag{48}$$

The above right side can be combined into one integral, such that
integrating by parts gives zero for the end conditions plus the resulting
integral.

$$MSE_I - MSE_{II} = \int_{0}^{\infty} [K(\alpha) - F(\alpha)] \left\{ \frac{\partial MS(\alpha)}{\partial \alpha} \right\} d\alpha. \tag{49}$$

Assumption 10 makes the bracketed term nonnegative, whereas
Lemma 6 makes the braced term nonnegative, so the right side of
(49) when integrated is nonnegative, which proves the theorem.
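
Lemma 6 can be checked numerically for a particular nonlinearity. The sketch below assumes equally likely signal values of +1 and -1 for $p_1(b)$ and the clip nonlinearity with parameters (0.5, 0.5, 1.3) from Table III, which is odd and has a slope between 0 and 1 that does not increase with |x|. It evaluates $MS(\alpha)$ on a grid, so it is a spot check and not a proof.

```python
def nl(u, x=0.5, y=0.5, z=1.3):
    """Clip nonlinearity: slope 1 below x, slope y from x to z, slope 0 above z."""
    mag = abs(u)
    if mag <= x:
        out = mag
    elif mag <= z:
        out = x + y * (mag - x)
    else:
        out = x + y * (z - x)
    return out if u >= 0 else -out


def ms(alpha, signal_levels=(-1.0, 1.0)):
    """MS(alpha): mean square error when the noise is +alpha or -alpha, equally likely."""
    total = 0.0
    for b in signal_levels:
        for c in (-alpha, alpha):
            total += (nl(b + c) - b) ** 2
    return total / (2 * len(signal_levels))


alphas = [0.05 * i for i in range(101)]
values = [ms(a) for a in alphas]
nondecreasing = all(v2 >= v1 - 1e-12 for v1, v2 in zip(values, values[1:]))
```

On this grid $MS(\alpha)$ never decreases as the noise magnitude grows, in agreement with the lemma; by equation (49) a channel whose noise magnitudes are stochastically larger therefore has the larger mean square error.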